Profiling live mod_perl backends

on 29.03.2009 22:08:31 by Cosimo Streppone

Hi,

I have no experience in profiling mod_perl.
I'd like to use Devel::NYTProf to profile a production mod_perl backend.

There's an LVS load balancer in front of the mod_perl backends.
So far my idea is:

- identify 1 backend that I want to execute profiling on
- make sure it's idle (weight=0 in LVS)
- restart it with -X ??
- assign it some share of the production load
- let it run for "n" hours
- stop it
- collect profiling data

Is profiling mod_perl like this at all possible?
Does that make sense?
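
For concreteness, the setup I have in mind for step 3 is roughly the
Devel::NYTProf::Apache recipe (untested on my side; the output path
and options are just placeholders):

   # httpd.conf
   PerlPassEnv NYTPROF
   PerlModule Devel::NYTProf::Apache

   # shell, when starting the single-process server
   NYTPROF=file=/tmp/nytprof.out:addpid=1 httpd -X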

So far I've been profiling "statically" by having some
production backends write a special log file, letting
them run for 1 day, and then analyzing the log file
to gather data.

This method doesn't take into account stuff like TransHandlers,
and doesn't tell me which are the heaviest or most-called
functions across the whole code base.

So, any ideas? Or rather, what's the state-of-the-art of
profiling mod_perl apps?

--
Cosimo

Re: Profiling live mod_perl backends

on 29.03.2009 22:34:47 by Perrin Harkins

On Sun, Mar 29, 2009 at 4:08 PM, Cosimo Streppone wrote:
> Is profiling mod_perl like this at all possible?

Yes.

> Does that make sense?

No.

You'll get so much data that you won't be able to make heads or tails
of it. And profiling is heavy enough that people hardly ever do it on
production servers.

Just pick one type of request that seems slow, make that request on
your dev machine from your browser, and look at the results. This is
normally all it takes to find bottlenecks.

If you're concerned about finding which things are most frequently
called, just do standard log analysis to find out.
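
For the frequency side, even a quick one-liner over the access log is
usually enough (rough sketch; assumes common log format):

   # count request paths, most frequent first
   perl -lane '$c{$F[6]}++;
      END { printf "%8d %s\n", $c{$_}, $_
            for sort { $c{$b} <=> $c{$a} } keys %c }' access_log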

- Perrin

Re: Profiling live mod_perl backends

on 29.03.2009 22:44:46 by Cosimo Streppone

Perrin Harkins wrote:

> On Sun, Mar 29, 2009 at 4:08 PM, Cosimo Streppone
> wrote:

>> Is profiling mod_perl like this at all possible?
>
> Yes.
>
>> Does that make sense?
>
> No.
>
> You'll get so much data that you won't be able to make heads or tails
> of it. And profiling is heavy enough that people hardly ever do it on
> production servers.

Of course. But I could dedicate a spare backend to this for
a reasonable amount of time.

The main problem is that in the past we experienced some kind of
performance problems that only manifested themselves really clearly
in production and only at peak traffic hours.
Out of peak hours, everything was fine.

I'd like to avoid that in the future. In my case, it wasn't
obvious where to look for the root cause of this performance problem.

> Just pick one type of request that seems slow, make that request on
> your dev machine from your browser, and look at the results. This is
> normally all it takes to find bottlenecks.

Yes, this is what we already do.
But it's not enough.

At first, I tried to replicate production load, but it's
nearly impossible. Then I used siege and other tools to replay some
more or less "real" browser sessions, and that's ok.
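
(For the record, what I run is roughly this; the URL file here is
invented:)

   # urls.txt holds one URL per line, captured from real sessions
   siege -c 50 -d 1 -t 10M -f urls.txt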

However, some bad problems start to be evident only with really
high req/s rate.

> If you're concerned about finding which things are most frequently
> called, just do standard log analysis to find out.

Yes, of course. Thanks.

--
Cosimo

Re: Profiling live mod_perl backends

on 29.03.2009 22:52:33 by Perrin Harkins

On Sun, Mar 29, 2009 at 4:44 PM, Cosimo Streppone wrote:
> The main problem is that in the past we experienced some kind of
> performance problems that only manifested themselves really clearly
> in production and only at peak traffic hours.
> Out of peak hours, everything was fine.

That sounds like a problem with a shared resource like the database,
not something you'll find by profiling the code. You'd be better off
either using DBI::Profile or using logging on your database to find
the problem.
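
DBI::Profile can be switched on without touching the code, e.g.
(sketch; "2" means aggregate times per SQL statement):

   DBI_PROFILE=2 apachectl start

or per-handle from the code itself:

   $dbh->{Profile} = '!Statement';  # grouped by statement; a summary
                                    # goes to STDERR when the handle
                                    # is destroyed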

- Perrin

Re: Profiling live mod_perl backends

on 30.03.2009 13:45:56 by Carl Johnstone

Cosimo Streppone wrote:
> However, some bad problems start to be evident only with really
> high req/s rate.

Another possibility is that once you hit full load you're simply pushing
the server over the edge of what it can manage. Use your normal OS
monitoring tools to ensure that you haven't started to swap excessively
or have too many CPU-bound requests.

If that is the case, you'll probably find that you benefit from
reducing the number of concurrent Apache processes.
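
Something as simple as this usually shows it (the numbers below are
illustrative, not recommendations):

   # watch the si/so (swap) and r (run queue) columns at peak load
   vmstat 5

   # httpd.conf: cap concurrency below the point where swapping starts
   MaxClients           30
   MaxRequestsPerChild  1000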

Carl

Re: Profiling live mod_perl backends

on 30.03.2009 13:46:09 by Rolf Banting

On Sun, Mar 29, 2009 at 9:52 PM, Perrin Harkins wrote:

> On Sun, Mar 29, 2009 at 4:44 PM, Cosimo Streppone
> wrote:
> > The main problem is that in the past we experienced some kind of
> > performance problems that only manifested themselves really clearly
> > in production and only at peak traffic hours.
> > Out of peak hours, everything was fine.
>
> That sounds like a problem with a shared resource like the database,
> not something you'll find by profiling the code. You'd be better off
> either using DBI::Profile or using logging on your database to find
> the problem.
>
> - Perrin


There's a neat idea in "Perl Hacks" where you nominate every 100th (or
whatever) object as a debug object. Perhaps you could do something
similar with Devel::NYTProf - profile only every nth request during
peak times.

I assume you have checked "obvious" things like CPU and disk usage
stats, KeepAlive, MaxClients, etc.
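
An untested sketch of the every-nth-request idea, starting NYTProf
disabled and toggling it per request (the module name and the 1-in-100
rate are made up):

   # shell: export NYTPROF=start=no:file=/tmp/nytprof.out:addpid=1
   # httpd.conf:
   #   PerlPassEnv NYTPROF
   #   PerlModule Devel::NYTProf::Apache
   #   PerlInitHandler My::SampledProfiler

   package My::SampledProfiler;

   use strict;
   use warnings;
   use Apache2::RequestRec ();
   use Apache2::Const -compile => qw(OK);

   my $requests = 0;

   sub handler {
       my $r = shift;
       if ( ++$requests % 100 == 0 ) {    # profile 1 request in 100
           DB::enable_profile();
           $r->push_handlers(
               PerlCleanupHandler => sub {
                   DB::disable_profile();
                   return Apache2::Const::OK;
               }
           );
       }
       return Apache2::Const::OK;
   }

   1;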

Re: Profiling live mod_perl backends

on 30.03.2009 15:14:10 by Cosimo Streppone

On 30 March 2009 at 13:46:09, Rolf Banting wrote:

> On Sun, Mar 29, 2009 at 9:52 PM, Perrin Harkins
> wrote:
>
>> On Sun, Mar 29, 2009 at 4:44 PM, Cosimo Streppone
>> wrote:
>> > The main problem is that in the past we experienced some kind of
>> > performance problems that only manifested themselves really clearly
>> > in production and only at peak traffic hours.
>> > Out of peak hours, everything was fine.
>>
>> That sounds like a problem with a shared resource like the database,
>> not something you'll find by profiling the code. You'd be better off
>> either using DBI::Profile or using logging on your database to find
>> the problem.

I get the points.

The problem that we had, this was in November last year,
was that all the backends were at load 15.0-20.0 (normal was ~3-4)
after an update to the application.

In those cases, it's pretty clear where the problem is
(CPU/load/etc...). What's not really clear is which point in the
code is causing it.

In our case, it was the code, and particularly a single function,
used in many places, that spent a lot of time doing useless things.
We sorted that out "by intuition", knowing the hot spots of the code.

What I want to do now is to prevent this kind of problem, possibly
in a more systematic and/or scientific way, and I thought of doing
this by running automated performance/stress tests before deployment.
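
(The shape I have in mind is something like this; the URL and numbers
are invented:)

   # fail the release if the hottest URL regresses under load
   ab -n 2000 -c 50 http://staging/some/hot/url | grep 'Time per request'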

While I try to get there, I thought it might be useful to dedicate 1 of the
live backends to this "live profiling". Even if the application now is
not having any problem, even at peak times.

Maybe I just have to try and see what I get :)

DBI::Profile is also another good "track" to follow.

--
Cosimo

Re: Profiling live mod_perl backends

on 30.03.2009 18:11:40 by Rolf Banting

On Mon, Mar 30, 2009 at 2:14 PM, Cosimo Streppone wrote:

> On 30 March 2009 at 13:46:09, Rolf Banting wrote:
>
> > On Sun, Mar 29, 2009 at 9:52 PM, Perrin Harkins
> > wrote:
> >
> >> On Sun, Mar 29, 2009 at 4:44 PM, Cosimo Streppone
> >> wrote:
> >> > The main problem is that in the past we experienced some kind of
> >> > performance problems that only manifested themselves really clearly
> >> > in production and only at peak traffic hours.
> >> > Out of peak hours, everything was fine.
> >>
> >> That sounds like a problem with a shared resource like the database,
> >> not something you'll find by profiling the code. You'd be better off
> >> either using DBI::Profile or using logging on your database to find
> >> the problem.
>
> I get the points.
>
> The problem that we had, this was in November last year,
> was that all the backends were at load 15.0-20.0 (normal was ~3-4)
> after an update to the application.
>
> In those cases, it's pretty clear where the problem is
> (CPU/load/etc...). What's not really clear is which point in the
> code is causing it.
>
> In our case, it was the code, and particularly a single function,
> used in many places, that spent a lot of time doing useless things.
> We sorted that out "by intuition", knowing the hot spots of the code.
>
> What I want to do now is to prevent this kind of problem, possibly
> in a more systematic and/or scientific way, and I thought of doing
> this by running automated performance/stress tests before deployment.
>
> While I try to get there, I thought it might be useful to dedicate 1 of the
> live backends to this "live profiling". Even if the application now is
> not having any problem, even at peak times.
>
> Maybe I just have to try and see what I get :)
>
> DBI::Profile is also another good "track" to follow.
>
> --
> Cosimo
>

If you know which routines you want to profile, it's easy enough to set
up wrapped versions of the routines that record execution times etc.,
and install them at run time so the wrapped version gets executed.

Here's some simple test code I (mostly) found in "Higher-Order Perl" by
Mark Jason Dominus:

package TheDude;

sub make_me_proud { return "I love you pops" }

package main;

use Data::Dumper;

my %calls;
my %times;

sub profile {
    my ( $func, $name ) = @_;
    my $stub = sub {
        my $start = time;
        ## Call the real subroutine
        my $return  = $func->(@_);
        my $end     = time;
        my $elapsed = $end - $start;
        $calls{$name} += 1;
        $times{$name}->{'elapsed'} += $elapsed;
        return $return;
    };
    return $stub;
}

## Overwrite the subroutine with a wrapped version
no strict qw(refs);
*{"TheDude::make_me_proud"} =
    profile( \&TheDude::make_me_proud, "TheDude::make_me_proud" );

TheDude::make_me_proud() foreach ( 1 .. 1_000_000 );

print Dumper \%calls;
print Dumper \%times;

If you put that in a script and run it, you'll get something like:

$VAR1 = {
          'TheDude::make_me_proud' => 1000000
        };
$VAR1 = {
          'TheDude::make_me_proud' => {
                                        'elapsed' => 3
                                      }
        };

Obviously you can customise the routine returned by "profile" to do anything
you like.
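
One refinement: the built-in time() only has one-second resolution, so
for subs that are individually fast you'd want something like:

   use Time::HiRes qw(time);  # drop-in time() with sub-second resolution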

Rolf

Re: Profiling live mod_perl backends

on 30.03.2009 18:24:01 by Perrin Harkins

On Mon, Mar 30, 2009 at 9:14 AM, Cosimo Streppone wrote:
> The problem that we had, this was in November last year,
> was that all the backends were at load 15.0-20.0 (normal was ~3-4)
> after an update to the application.

That's pretty rare (hitting a CPU problem), and it sounds severe
enough that your idea of automated stress tests before deployment
would work for catching it.

> While I try to get there, I thought it might be useful to dedicate 1 of the
> live backends to this "live profiling". Even if the application now is
> not having any problem, even at peak times.

Sure, give it a shot. My suggestion would be to not collect too much
data before analyzing it though, at least at first. Devel::NYTProf
can take a really long time if you feed it more than a few requests at
once.
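
(Report generation is the slow part; for a single data file it's just
something like "nytprofhtml -f /tmp/nytprof.out".)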

- Perrin